Filter runtime guard logs by ready revision #277

Merged: Pigbibi merged 1 commit into main from codex/filter-runtime-guard-stale-logs-20260701 on Jul 1, 2026
Pigbibi (Collaborator) commented Jul 1, 2026

Summary

  • pass CLOUD_RUN_REGION into the runtime guard workflow
  • resolve the latest ready Cloud Run revision for each service
  • use the latest ready revision creation time as the lower bound for Cloud Run log queries, falling back to the original lookback window when metadata is unavailable
  • add focused tests for revision cutoff and target-specific region resolution
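The cutoff rule in the third bullet can be sketched as below. This is a minimal illustration, not the code from `scripts/cloud_run_runtime_guard.py`: the function name `log_lower_bound` and its parameters are hypothetical, and the revision creation time is assumed to be already parsed into a timezone-aware `datetime` (or `None` when metadata is unavailable).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def log_lower_bound(fallback: datetime, revision_created: Optional[datetime]) -> datetime:
    """Return the later of the lookback start and the revision creation time.

    `fallback` is the start of the original lookback window. When the
    revision creation time is unknown, or older than the lookback start,
    the original window is kept unchanged.
    """
    if revision_created is not None and revision_created > fallback:
        return revision_created
    return fallback


# Usage: a 30-minute lookback, with a revision created 10 minutes ago.
now = datetime(2026, 7, 1, 6, 0, tzinfo=timezone.utc)
lookback_start = now - timedelta(minutes=30)
cutoff = log_lower_bound(lookback_start, now - timedelta(minutes=10))
```

The cutoff only ever narrows the window; it never reaches further back than the configured lookback.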

Validation

  • python -m py_compile scripts/cloud_run_runtime_guard.py tests/test_cloud_run_runtime_guard.py
  • uvx ruff check scripts/cloud_run_runtime_guard.py tests/test_cloud_run_runtime_guard.py
  • stdlib inline simulation of _cloud_run_log_since

Pigbibi merged commit 22cb11d into main on Jul 1, 2026
2 checks passed
Pigbibi deleted the codex/filter-runtime-guard-stale-logs-20260701 branch on July 1, 2026 06:56

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 82f8fb85f9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +275 to +276
    if revision_start and revision_start > fallback:
        return revision_start

P2: Use serving revisions when narrowing the log window

When the latest ready revision of a service is not receiving 100% of traffic, for example after a no-traffic deploy or during a traffic split, this cutoff is applied to every log for the service, even though the query still filters only by service_name. As soon as a newer ready revision exists, failures that the currently serving older revision logged earlier in the lookback window can be hidden. Base the cutoff/filter on the serving revisions in status.traffic, or include the resolved revision_name in the filter if the guard is intentionally checking only that revision.
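One way to read the first suggestion is sketched below. It assumes the service payload exposes `status.traffic` as a list of entries with `revisionName` and `percent` keys, and that creation times for the relevant revisions have already been resolved into a mapping. The function name `serving_cutoff` and both parameter shapes are hypothetical, not part of the reviewed script.

```python
from datetime import datetime, timezone
from typing import Dict


def serving_cutoff(
    service_payload: dict,
    created_at: Dict[str, datetime],
    fallback: datetime,
) -> datetime:
    """Lower bound for log queries based on revisions that serve traffic.

    Uses the earliest creation time among revisions with percent > 0, so
    logs from an older revision that still serves traffic stay in the
    window. Falls back to the lookback start when traffic or creation
    metadata is missing.
    """
    status = service_payload.get("status") or {}
    traffic = status.get("traffic") if isinstance(status, dict) else None
    serving = [
        entry.get("revisionName")
        for entry in (traffic if isinstance(traffic, list) else [])
        if isinstance(entry, dict) and (entry.get("percent") or 0) > 0
    ]
    times = [created_at.get(name) for name in serving]
    if not times or any(t is None for t in times):
        return fallback
    return max(fallback, min(times))


# Usage: a no-traffic deploy. The newer revision serves 0%, so it is ignored.
old = datetime(2026, 7, 1, 5, 0, tzinfo=timezone.utc)
new = datetime(2026, 7, 1, 6, 30, tzinfo=timezone.utc)
lookback_start = datetime(2026, 7, 1, 6, 0, tzinfo=timezone.utc)
payload = {"status": {"traffic": [
    {"revisionName": "svc-00001", "percent": 100},
    {"revisionName": "svc-00002", "percent": 0},
]}}
cutoff = serving_cutoff(payload, {"svc-00001": old, "svc-00002": new}, lookback_start)
```

In this example the window is not narrowed at all, because the only serving revision predates the lookback start.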

    metadata = revision_payload.get("metadata") or {}
    if not isinstance(metadata, dict):
        return None
    return _parse_timestamp(metadata.get("creationTimestamp"))

P2: Cut off at the ready transition time

When the new revision takes time to become Ready, metadata.creationTimestamp precedes the Ready/ContainerHealthy transition, so startup or readiness-check errors emitted after creation still pass the timestamp filter. Even with RUNTIME_GUARD_IGNORE_PRE_READY_REVISION_LOGS enabled, the guard can therefore alert on transient pre-ready logs from the revision that eventually became ready. Use the Ready condition's transition time as the lower bound, falling back to the creation time only if that timestamp is unavailable.

    )

    for service in services:
        service_since = _cloud_run_log_since(project, service, since) if ignore_pre_ready_logs else since

P2: Keep the success check on the configured lookback

When RUNTIME_GUARD_REQUIRE_SUCCESS=true and a new revision became ready inside the lookback window but has not received traffic yet, the same shortened service_since window is used for counting successes. That excludes successful requests from earlier in the configured lookback and emits a "no successful Cloud Run request ... in the last N minutes" alert even though the requirement was satisfied. Keep the success query on since, or change the alert semantics and message to require a post-ready success.
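The first option, keeping the two windows separate, might look like the sketch below. `QueryWindows` and `query_windows` are hypothetical names, and `revision_start` is assumed to be whatever lower bound the guard resolves for the revision (or `None` when metadata is unavailable).

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple, Optional


class QueryWindows(NamedTuple):
    errors_since: datetime   # lower bound for error/failure log queries
    success_since: datetime  # lower bound for the required-success query


def query_windows(
    since: datetime,
    revision_start: Optional[datetime],
    ignore_pre_ready_logs: bool,
) -> QueryWindows:
    """Narrow only the error window; the success check keeps the full lookback."""
    errors_since = since
    if ignore_pre_ready_logs and revision_start is not None and revision_start > since:
        errors_since = revision_start
    return QueryWindows(errors_since=errors_since, success_since=since)


# Usage: a revision became ready 5 minutes ago inside a 30-minute lookback.
now = datetime(2026, 7, 1, 6, 0, tzinfo=timezone.utc)
since = now - timedelta(minutes=30)
windows = query_windows(since, now - timedelta(minutes=5), ignore_pre_ready_logs=True)
```

Here a successful request from 20 minutes ago still satisfies the success requirement, while errors older than the revision start are ignored.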
